v0.2 Status

Just started planning the changes

Status

  • v0.2 uses json-to-mysql v1.0
  • v0.3 uses json-to-mysql v2.0

TODO:

  • clean up this document & remove duplicated notes

Where I'm At

  • I wrote
    • convert to do all conversions
    • convert_to_csv to convert .ods to csv
    • clean_csv to clean the convert_to_csv output
    • convert_to_json to convert CLEAN csv to json
  • I copied in
    • clean_row
    • clean_value
  • I wrote tests
    • testCleanCsv to test clean_csv method
    • testCsvToJson to test convert_to_json method
  • I need to
    • convert_to_csv: Consider moving the clean_csv call OUT of it & instead doing this in the if ($clean_csv) conditional of convert
      • Alternate (or additional): Accept optional path(s) for the output file location(s)
    • test convert_to_csv
    • write convert_to_sql (mostly just copy, then edit)
    • write convert_to_sqlite (mostly just copy)
      • probably should have a different name, because it's just executing SQL, NOT really converting.
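
A rough sketch of what moving the clean_csv call into convert might look like, with an optional output path. The class name ConvertSketch, the method signatures, the $csv_path parameter, and the $steps property are all assumptions for illustration; the two protected methods are placeholders, not the real conversion code.

```php
<?php
// Hypothetical sketch: method names mirror the notes above, signatures are assumed.
class ConvertSketch
{
    /** Steps run so far, recorded only so the call order is visible. */
    public array $steps = [];

    public function convert(string $source, bool $clean_csv = true, ?string $csv_path = null): string
    {
        // Optional output path; defaults to out.csv beside the source file.
        $csv_path = $csv_path ?? dirname($source) . '/out.csv';

        $this->convert_to_csv($source, $csv_path);
        if ($clean_csv) {
            // Cleaning happens here, not inside convert_to_csv().
            $this->clean_csv($csv_path);
        }
        return $csv_path;
    }

    protected function convert_to_csv(string $source, string $csv_path): void
    {
        // Placeholder for the real .ods to csv conversion.
        $this->steps[] = "convert_to_csv:$csv_path";
    }

    protected function clean_csv(string $csv_path): void
    {
        // Placeholder for the real cleaning pass.
        $this->steps[] = "clean_csv:$csv_path";
    }
}
```

With this shape, convert_to_csv can be tested on its own, because it no longer triggers the cleaning pass.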

TODO

  • DONE Create a proper data/file structure for the tests with at least two data sources
  • DONE integrate phptests & write a proper test case
  • DONE Write \Tlf\DataConverter class
  • IN PROGRESS fill out methods one at a time from the running it section below
  • Update the README.src.md to make sure the example is correct
  • run scrawl & push
  • update default branch to v0.2

Notes from my phone (Oct 28, 2021)

  • change json-to-mysql into a general-purpose data converter. Create a liaison app to facilitate it all via a data dir. Probably new Addon($package) (or new Compo(...)) until I update liaison. This enables data conversion & display for the given package.
  • data lib new set up (add just one data source) or a dir containing multiple data sources. (Can one data source have multiple spreadsheets???). Structure:

AppDir/data/
    idph/
        config.json ## or can just use a global config for all the data dirs & allow per-source overrides
        source.[ods|csv|json]
        out.json
        out.csv
        etc...
        files/
            ... Just associated files
    cdc/
        source.ods
        ....
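
The "global config with per-source overrides" idea could be resolved roughly like this. This is only a sketch: the function name source_config is hypothetical, and it assumes config.json holds a flat JSON object.

```php
<?php
// Hypothetical helper: merge a global config with one source's optional config.json.
function source_config(array $global, string $source_dir): array
{
    $file = rtrim($source_dir, '/') . '/config.json';
    if (!is_file($file)) {
        return $global; // no per-source file: use the global config as-is
    }
    $overrides = json_decode(file_get_contents($file), true);
    if (!is_array($overrides)) {
        return $global; // unreadable or non-object JSON: keep the global values
    }
    // Per-source values win; keys the source omits keep the global value.
    return array_replace($global, $overrides);
}
```

array_replace only merges the top level, so nested settings would need array_replace_recursive instead.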

Goals

Quality of Life features

  • meta.json file is optional initially (one will be created for you)
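
One possible way to create meta.json on first use, as a sketch only. The function name ensure_meta and the default 'name' key are placeholders; the real meta.json schema is not settled in these notes.

```php
<?php
// Hypothetical helper; the default keys are placeholders, not the real meta.json schema.
function ensure_meta(string $source_dir): array
{
    $file = rtrim($source_dir, '/') . '/meta.json';
    if (!is_file($file)) {
        // First run: write a minimal meta.json named after the source directory.
        $defaults = ['name' => basename($source_dir)];
        file_put_contents($file, json_encode($defaults, JSON_PRETTY_PRINT));
        return $defaults;
    }
    // An existing file is never overwritten; invalid JSON yields an empty array.
    return json_decode(file_get_contents($file), true) ?? [];
}
```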

Running it

$converter = new \Tlf\DataConverter();
$converter->addSource($dir.'source-name');
$converter->addSource($dir.'source-name2');
$converter->convert();

$meta = $converter->meta('ns:source-name'); // return meta.json as array 
$metaMd = $converter->metaMd('ns:source-name'); // return contents of meta.md
$metaMdPath = $converter->metaMdPath('ns:source-name'); // return path to meta.md file

$out = $converter->files('ns:source-name'); // return array of paths to files
// $out has the form ['json'=>'/path/to/out.json', 'sql'=>'/path/to/out.sql', ...]

File structure:

source-name/ 
    meta.json ## define source name, sqlite table name, offset of table headers, offset of first data row ... idk what else
    meta.md   ## a meta information file intended for human consumption / delivery on a web page
    files/
        ... any relevant input files, such as images, source data, web pages, whatever is relevant
    source.ods ## in v0.3 I want source to allow any of these file types
    out.json
    out.csv
    out.ods ## just a copy of source.ods??
    out.sql ## combines CREATE TABLE & INSERT statement files from json-to-mysql output
    out.sqlite
    out/
        ... output files from json-to-mysql
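
Given that layout, the array returned by files() could be built by checking which out.* files exist. This is a sketch, not the actual method: output_files is a hypothetical standalone function, and the extension list is taken from the structure above.

```php
<?php
// Hypothetical stand-in for files(): list the out.* files that exist in a source directory.
function output_files(string $source_dir): array
{
    $found = [];
    foreach (['json', 'csv', 'ods', 'sql', 'sqlite'] as $ext) {
        $path = rtrim($source_dir, '/') . '/out.' . $ext;
        if (is_file($path)) {
            $found[$ext] = $path; // keyed by extension, e.g. 'json' => '.../out.json'
        }
    }
    return $found;
}
```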

Maybe for v0.3

  • Create a conversions log file, so we know every time a new conversion was done & maybe how many rows were added (or changed? Probably not diffing to that degree though)
  • Allow custom offsets & stuff for source data files (currently has a very strict setup)
  • start with any available format & convert to all the others
  • Add extensible class that each data source CAN implement (perhaps in a meta.php file) to customize certain features
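
For the conversions log idea, appending one line per run may be enough. A minimal sketch: the function name log_conversion, the file name conversions.log, and the line format are all assumptions.

```php
<?php
// Hypothetical helper; the file name conversions.log and the line format are assumptions.
function log_conversion(string $source_dir, int $row_count): string
{
    // ISO 8601 timestamp, a tab, then the row count.
    $line = date('c') . "\t" . $row_count . " rows\n";
    file_put_contents(rtrim($source_dir, '/') . '/conversions.log', $line, FILE_APPEND);
    return $line;
}
```

This records when a conversion ran and how many rows it produced, without attempting any diffing.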

Code to consider

This is code used on a site of mine to help me display meta files on the web. I want to correlate a data name (like case/mchd-daily) with a set of data outputs

<section>
<?php

$view = $view_category .'/'. $view_name;
$map = [
    'case/mchd-daily'=>'mchd-daily-cases',
    'case/mchd-weekly'=>'mchd-weekly-cases',
    'death/mchd-daily'=>'mchd-daily-deaths',
    'death/mchd-weekly'=>'mchd-weekly-deaths',
    'death/rates-by-county'=>'death-rates',

    'hospital/mhs'=>'mhs-hospitalizations',
    'hospital/hshs'=>'hshs-hospitalizations',

    'idph/daily'=>'idph-cases',
    'idph/weekly'=>'idph-cases',

    'vacc/ByDay'=>'cdc-vaccinations',
    'vacc/locations'=>'vaccine-locations',

    'variant/mchd'=>'mchd-variants',
];

$view = str_replace('_', '-', $view);
if (!isset($map[$view])){
    echo "There was an error.";
    return;
}

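// File name of the meta file for this view, e.g. mchd-daily-cases.md
$meta_file = $map[$view].'.md';

$root = $_SERVER['DOCUMENT_ROOT'];
$data_dir = $root.'/vendor/taeluf/data.macon-county-covid/data/';

// Full path to the meta file inside the data package
$metaFile = $data_dir.$meta_file;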
if (is_file($metaFile)){
    echo "<markdown>\n"
        .file_get_contents($metaFile)
        ."\n</markdown>"
        ;
}

echo $lia->view('covid:downloads', ['by'=>'source', 'filter'=>$map[$view]]);

echo $lia->phad($view, ['access.name'=>'all']);

?>

</section>